
Switch python runtime to CurrentThreadRuntime #7896

Open
robert3005 wants to merge 16 commits into develop from rk/pythonchanges


@robert3005 robert3005 commented May 12, 2026

We want to unify the language bindings so that they behave the same way when interacting with Vortex. This PR brings the Python bindings in line with C and Java by using CurrentThreadRuntime by default.

Vortex uses a shared runtime underneath the Python API. When no background threads are configured, the Python thread consuming a scan drives the work on it. This means multiple Python threads can make progress independently, as long as each thread owns the reader it is consuming:

from concurrent.futures import ThreadPoolExecutor

import pyarrow.compute as pc
import vortex as vx


def sum_column(path: str, column: str) -> int | float:
    reader = vx.open(path).to_arrow([column], batch_size=64_000)
    total = 0

    for batch in reader:
        value = pc.sum(batch.column(column)).as_py()
        if value is not None:
            total += value

    return total


columns = ["tip_amount", "fare_amount", "total_amount"]
with ThreadPoolExecutor(max_workers=len(columns)) as threads:
    totals = list(threads.map(lambda column: sum_column("example.vortex", column), columns))
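
As a library-free illustration of the model described above, the sketch below uses a plain generator as a hypothetical stand-in for a scan reader (`make_reader` and `consume` are illustrative names, not Vortex APIs). A generator only does work when the consuming thread asks for the next item, so each worker thread ends up driving its own reader. This is an analogy for the current-thread behaviour, not a description of Vortex internals.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def make_reader(n_batches: int):
    """Hypothetical stand-in for a scan reader: work runs on whichever thread iterates it."""
    for index in range(n_batches):
        # Record the thread that actually produced this "batch".
        yield index, threading.get_ident()


def consume(n_batches: int):
    # The reader is created and consumed by the same thread, so that thread owns it.
    reader = make_reader(n_batches)
    driver_threads = {thread_id for _, thread_id in reader}
    return threading.get_ident(), driver_threads


with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(consume, [4, 4, 4]))

# Every reader was driven only by the thread that consumed it.
all_owned = all(drivers == {caller} for caller, drivers in results)
```

Sharing one reader between threads would break this one-owner property, which is why the example above opens a separate reader inside each call.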

Alternatively, users who want Vortex to work in the background, independently of user-level Python threads, can set the worker count to the desired value:

import vortex as vx
import vortex.runtime as vxrt

previous_workers = vxrt.worker_count()
vxrt.set_worker_threads_to_available_parallelism()

try:
    reader = vx.open("example.vortex").to_arrow(batch_size=64_000)
    table = reader.read_all()
finally:
    vxrt.set_worker_threads(previous_workers)
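
The save-and-restore pattern above could be wrapped in a context manager so callers cannot forget the `finally` step. The sketch below is a minimal version under stated assumptions: `temporary_worker_threads` is a hypothetical helper, not part of Vortex, and `FakeRuntime` is a stand-in used only so the snippet runs without Vortex installed.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporary_worker_threads(
    get_count: Callable[[], int],
    set_count: Callable[[int], None],
    workers: int,
) -> Iterator[None]:
    """Set the worker count for the duration of a block, then restore it."""
    previous = get_count()
    set_count(workers)
    try:
        yield
    finally:
        # Restore the previous value even if the body raises.
        set_count(previous)


class FakeRuntime:
    """Hypothetical stand-in for a runtime module; not a Vortex class."""

    def __init__(self) -> None:
        self._workers = 0

    def worker_count(self) -> int:
        return self._workers

    def set_worker_threads(self, count: int) -> None:
        self._workers = count


runtime = FakeRuntime()
with temporary_worker_threads(runtime.worker_count, runtime.set_worker_threads, 4):
    inside = runtime.worker_count()
after = runtime.worker_count()
```

With Vortex, the getter and setter would presumably be `vxrt.worker_count` and `vxrt.set_worker_threads` as used in the snippet above; their exact signatures are assumed here rather than confirmed.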

These examples are added to the docs.

@robert3005 robert3005 added the changelog/break A breaking API change label May 12, 2026

codspeed-hq Bot commented May 12, 2026

Merging this PR will not alter performance

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

✅ 1216 untouched benchmarks


Comparing rk/pythonchanges (96d1d98) with develop (1f6fb0a)


@robert3005 robert3005 marked this pull request as ready for review May 12, 2026 21:50

gatesn commented May 12, 2026

I'm not sure we necessarily want to expose the session. I had been working on the assumption that in Python land we can just assume a single global instance.

Also, would you mind adding to the PR description how multi-threading works in the new design? Do users need multiple Python threads? Or is there a default background pool? Can I start one?

@robert3005 (Contributor, Author) commented

I think one global session is fine as long as we don't support free-threaded Python. In the free-threaded model, Python behaves more like any other language, but I also don't know anyone using Python 3.14 yet.
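
Whether the single-global-session assumption holds depends on whether the interpreter is a free-threaded build. A minimal, standard-library-only sketch of that check follows. The helper names are illustrative, and `sys._is_gil_enabled()` only exists on Python 3.13 and later, hence the `getattr` fallback.

```python
import sys
import sysconfig


def is_free_threaded_build() -> bool:
    """True if this interpreter was compiled with free-threading support."""
    # Py_GIL_DISABLED is 1 on free-threaded builds and 0 or None otherwise.
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def gil_is_enabled() -> bool:
    """True if the GIL is active right now in this process."""
    check = getattr(sys, "_is_gil_enabled", None)
    # Interpreters older than 3.13 have no such function and always run with the GIL.
    return True if check is None else bool(check())


free_threaded = is_free_threaded_build()
gil_enabled = gil_is_enabled()
```

A binding could consult a check like this at import time to decide whether relying on one global session is still safe.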

@robert3005 robert3005 changed the title from "Expose VortexSession in python bindings and switch to CurrentThreadRuntime" to "Switch python runtime to CurrentThreadRuntime" May 13, 2026

@gatesn gatesn left a comment


I think we should default to core count - 1 and let the user adjust from there.
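
One way to read this suggestion, sketched with the standard library only: `default_worker_count` is a hypothetical helper, not a Vortex function. Clamping at zero is an assumption made here, on the basis that zero workers would correspond to the current-thread mode described in the PR description.

```python
import os
from typing import Optional


def default_worker_count(cpu_count: Optional[int] = None) -> int:
    """Return core count - 1, never going below zero."""
    # os.cpu_count() may return None when the count cannot be determined.
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    # On a single-core machine this yields 0, i.e. no background workers.
    return max(0, cores - 1)


eight_core_default = default_worker_count(8)
single_core_default = default_worker_count(1)
this_machine_default = default_worker_count()
```

Leaving one core free keeps the consuming Python thread from competing with the background pool, and users could still override the value explicitly.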

Comment thread docs/user-guide/vortex-python.md Outdated
Comment thread vortex-python/python/vortex/_lib/runtime.pyi Outdated
Comment thread vortex-python/python/vortex/_lib/runtime.pyi Outdated
Comment thread vortex-python/python/vortex/arrays.py Outdated
robert3005 added 15 commits May 15, 2026 00:43
Signed-off-by: Robert Kruszewski <github@robertk.io>

Labels

changelog/break A breaking API change


2 participants